Fast Cross-Validation via Sequential Analysis

نویسندگان

  • Tammo Krueger
  • Danny Panknin
  • Mikio Braun
چکیده

With the increasing size of today’s data sets, finding the right parameter configuration via cross-validation can be an extremely time-consuming task. In this paper we propose an improved cross-validation procedure which uses non-parametric testing coupled with sequential analysis to determine the best parameter set on linearly increasing subsets of the data. By eliminating underperforming candidates quickly and keeping promising candidates as long as possible the method speeds up the computation while preserving the capability of the full cross-validation. The experimental evaluation shows that our method reduces the computation time by a factor of up to 70 compared to a full cross-validation with a negligible impact on the accuracy.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Fast cross-validation via sequential testing

With the increasing size of today’s data sets, finding the right parameter configuration in model selection via cross-validation can be an extremely time-consuming task. In this paper we propose an improved cross-validation procedure which uses nonparametric testing coupled with sequential analysis to determine the best parameter set on linearly increasing subsets of the data. By eliminating un...

متن کامل

Fast SFFS-Based Algorithm for Feature Selection in Biomedical Datasets

Biomedical datasets usually include a large number of features relative to the number of samples. However, some data dimensions may be less relevant or even irrelevant to the output class. Selection of an optimal subset of features is critical, not only to reduce the processing cost but also to improve the classification results. To this end, this paper presents a hybrid method of filter and wr...

متن کامل

A TRUST-REGION SEQUENTIAL QUADRATIC PROGRAMMING WITH NEW SIMPLE FILTER AS AN EFFICIENT AND ROBUST FIRST-ORDER RELIABILITY METHOD

The real-world applications addressing the nonlinear functions of multiple variables could be implicitly assessed through structural reliability analysis. This study establishes an efficient algorithm for resolving highly nonlinear structural reliability problems. To this end, first a numerical nonlinear optimization algorithm with a new simple filter is defined to locate and estimate the most ...

متن کامل

Quantification of the Impact of Feature Selection on the Variance of Cross-Validation Error Estimation

Given the relatively small number of microarrays typically used in gene-expression-based classification, all of the data must be used to train a classifier and therefore the same training data is used for error estimation. The key issue regarding the quality of an error estimator in the context of small samples is its accuracy, and this is most directly analyzed via the deviation distribution o...

متن کامل

Sequential Data Assimilation with Sigma-point Kalman Filter on Low-dimensional Manifold

In order to address the highly nonlinear dynamics in estuary flow, we propose a novel data assimilation system based on components designed to accurately capture nonlinear dynamics. The core of the system is a sigma-point Kalman filter coupled to a fast, neural network emulator of the flow dynamics. In order to be computationally feasible, the entire system operates on a low-dimensional subspac...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2011